Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Busco #1239

Merged
merged 32 commits into from
May 6, 2022
Merged

Busco #1239

merged 32 commits into from
May 6, 2022

Conversation

priyanka-surana
Copy link
Contributor

Closes #569

This BUSCO module was written from scratch with nf-core v2.2 template.

@priyanka-surana priyanka-surana added new module Adding a new module WIP Work in progress labels Jan 27, 2022
@mahesh-panchal
Copy link
Member

mahesh-panchal commented Feb 10, 2022

Something that may be of interest:
Nextflow changes the entrypoint for docker so the variable AUGUSTUS_CONFIG_PATH is not bound like it would using the normal container entrypoint.

Adding this code snippet may be useful, in combination with a check if the augustus config path from input is not passed as a variable.

    # Nextflow changes the container --entrypoint to /bin/bash (container default entrypoint: /usr/local/env-execute)
    # Check for container variable initialisation script and source it.
    if [ -f "/usr/local/env-activate.sh" ]; then
        set +u  # Otherwise, errors out because of various unbound variables
        . "/usr/local/env-activate.sh"
        set -u
    fi

modules/busco/main.nf Outdated Show resolved Hide resolved
modules/busco/main.nf Outdated Show resolved Hide resolved
tests/modules/busco/test.yml Outdated Show resolved Hide resolved
modules/busco/main.nf Outdated Show resolved Hide resolved
modules/busco/main.nf Outdated Show resolved Hide resolved
@priyanka-surana
Copy link
Contributor Author

priyanka-surana commented Feb 25, 2022

@mahesh-panchal I am not entirely clear on the changes you suggested. But take a look at the code, I have added them and tested locally. So far it works.

@mahesh-panchal
Copy link
Member

@mahesh-panchal I am not entirely clear on the changes you suggested. But take a look at the code, I have added them and tested locally. So far it works.

I'll have a proper look next week.

Copy link
Member

@mahesh-panchal mahesh-panchal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional comments. Thank you for the effort you're putting into this.

modules/busco/main.nf Outdated Show resolved Hide resolved
modules/busco/main.nf Outdated Show resolved Hide resolved
tests/modules/busco/main.nf Outdated Show resolved Hide resolved
@mahesh-panchal
Copy link
Member

@mahesh-panchal I am not entirely clear on the changes you suggested.

I'll explain what's happening in the docker container above.
The Dockerfile is written such that when it's started, either by docker run ( using default entry point) or with an interactive shell docker run ... bash that the file /usr/local/env-execute is executed. This in turn sources /usr/local/env-activate.sh which then sources a bunch of other files as seen above, to provide the environment variables necessary for busco.
However, Nextflow explicitly changes the container entry point from /usr/local/env-executeto /bin/bash which means all the files above aren't sourced when you run the container. So the fix is to source the files that are needed.
See this example on the original container as to how docker is run (this is what was in my .command.run) (ezlabgva/busco:v5.2.2_cv2 and the biocontainers container have the same entry point):

docker run -i --cpus 1.0 -e "NXF_DEBUG=${NXF_DEBUG:=0}" -v /workspace/pipelines-nextflow/work:/workspace/pipelines-nextflow/work -v "$PWD":"$PWD" -w "$PWD" --entrypoint /bin/bash -u "$( id -u ):$( id -g )" --name $NXF_BOXID ezlabgva/busco:v5.2.2_cv2 -c "/bin/bas
h /workspace/pipelines-nextflow/work/92/502733a8479e60e4e99467fa538b31/.command.run nxf_trace"

@nvlachos
Copy link

Hello All,
Thank you for your working on getting this new module complete! I was wondering if there has been any new progress done on this module lately? It seems as if many issues have been resolved and that it may be close to release? If you could identify the remaining tasks, our group would be willing to assist if that would be helpful as we are trying to put out a pipeline that utilizes this tool. Thanks!

@mahesh-panchal
Copy link
Member

If you have the time to address my review comments please do.

@priyanka-surana
Copy link
Contributor Author

@nvlachos I haven't worked on this lately. It is in the current state and as @mahesh-panchal suggested, if you address his comments, you should pretty much be ready to merge. Another thing to consider is to make updates to bring to the current tools version, this was built using v2.2.

@mahesh-panchal
Copy link
Member

Hi @jvhagey, I only just saw you were working on this on another pull request. We should merge our efforts.

@muffato
Copy link
Member

muffato commented May 3, 2022

Absolutely ! @jvhagey

@mahesh-panchal
Copy link
Member

This pull request is ready for review aside from the linting error. So let me know what you would like to include. I see you split the outputs, so are you using certain channels downstream?

@sateeshperi sateeshperi mentioned this pull request May 4, 2022
13 tasks
@mahesh-panchal
Copy link
Member

@jvhagey Please merge the module if you're happy with it.

@mahesh-panchal mahesh-panchal self-assigned this May 5, 2022
@mahesh-panchal mahesh-panchal added Ready for Review and removed WIP Work in progress labels May 5, 2022
Copy link
Contributor

@sateeshperi sateeshperi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gr8 job! Thanks to all the authors for pushing this module to the finish line 💯

Copy link
Contributor

@sateeshperi sateeshperi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should fix the linting

modules/busco/main.nf Outdated Show resolved Hide resolved
Co-authored-by: Sateesh Peri <33637490+sateeshperi@users.noreply.github.com>
modules/busco/main.nf Outdated Show resolved Hide resolved
@drpatelh
Copy link
Member

drpatelh commented May 5, 2022

Linting was failing due to the comments at the end and not the syntax 1289626

Please push an issue to nf-core tools with a reproducible example so we can fix!

Copy link
Contributor

@jvhagey jvhagey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few edits.

modules/busco/main.nf Show resolved Hide resolved
modules/busco/main.nf Outdated Show resolved Hide resolved
modules/busco/main.nf Show resolved Hide resolved
tests/modules/busco/main.nf Outdated Show resolved Hide resolved
@mahesh-panchal
Copy link
Member

Testing with PROFILE=docker pytest --tag busco --symlink --kwd --git-aware
busco logs show:

2022-05-06 12:35:03 DEBUG:busco.busco_tools.base        Tool: sepp
2022-05-06 12:35:03 DEBUG:busco.busco_tools.base        Version: 4.4.0
2022-05-06 12:35:03 INFO:busco.busco_tools.Toolset      Running 1 job(s) on sepp, starting at 05/06/2022 12:35:03
2022-05-06 12:35:03 DEBUG:busco.busco_tools.Toolset     cmd call: run_sepp.py --cpu 2 --outdir /tmp/pytest_workflow_k_fwdbvn/busco_test_busco_genome_single_fasta/work/63/1ae8fafa20dc10c72ca89f6e636457/test-auto-busco/genome.fna/auto_lineage/run_bacteria_odb10/placement_files -t /tmp/pytest_workflow_k_fwdbvn/busco_test_busco_genome_single_fasta/work/63/1ae8fafa20dc10c72ca89f6e636457/busco_downloads/placement_files/tree.bacteria_odb10.2019-12-16.nwk -r /tmp/pytest_workflow_k_fwdbvn/busco_test_busco_genome_single_fasta/work/63/1ae8fafa20dc10c72ca89f6e636457/busco_downloads/placement_files/tree_metadata.bacteria_odb10.2019-12-16.txt -a /tmp/pytest_workflow_k_fwdbvn/busco_test_busco_genome_single_fasta/work/63/1ae8fafa20dc10c72ca89f6e636457/busco_downloads/placement_files/supermatrix.aln.bacteria_odb10.2019-12-16.faa -f /tmp/pytest_workflow_k_fwdbvn/busco_test_busco_genome_single_fasta/work/63/1ae8fafa20dc10c72ca89f6e636457/test-auto-busco/genome.fna/auto_lineage/run_bacteria_odb10/placement_files/marker_genes.fasta -F 15 -m amino -p /tmp/pytest_workflow_k_fwdbvn/busco_test_busco_genome_single_fasta/work/63/1ae8fafa20dc10c72ca89f6e636457/test-auto-busco/genome.fna/auto_lineage/run_bacteria_odb10/sepp_tmp_files
2022-05-06 12:36:12 INFO:busco.busco_tools.Toolset      [sepp]  1 of 1 task(s) completed
2022-05-06 12:36:12 ERROR:busco.BuscoRunner     Placements failed. Try to rerun increasing the memory or select a lineage manually.

So out of memory, even though it ran with nf-core modules create-test-yml busco

Copy link
Contributor

@sateeshperi sateeshperi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome! Thank you all for your contributions 🥇

@sateeshperi sateeshperi merged commit e13d634 into nf-core:master May 6, 2022
@muffato
Copy link
Member

muffato commented May 6, 2022

🎉 . Thank you everyone !

@muffato
Copy link
Member

muffato commented May 6, 2022

Testing with PROFILE=docker pytest --tag busco --symlink --kwd --git-aware busco logs show:

2022-05-06 12:35:03 DEBUG:busco.busco_tools.base        Tool: sepp
2022-05-06 12:35:03 DEBUG:busco.busco_tools.base        Version: 4.4.0
2022-05-06 12:35:03 INFO:busco.busco_tools.Toolset      Running 1 job(s) on sepp, starting at 05/06/2022 12:35:03
2022-05-06 12:35:03 DEBUG:busco.busco_tools.Toolset     cmd call: run_sepp.py --cpu 2 --outdir /tmp/pytest_workflow_k_fwdbvn/busco_test_busco_genome_single_fasta/work/63/1ae8fafa20dc10c72ca89f6e636457/test-auto-busco/genome.fna/auto_lineage/run_bacteria_odb10/placement_files -t /tmp/pytest_workflow_k_fwdbvn/busco_test_busco_genome_single_fasta/work/63/1ae8fafa20dc10c72ca89f6e636457/busco_downloads/placement_files/tree.bacteria_odb10.2019-12-16.nwk -r /tmp/pytest_workflow_k_fwdbvn/busco_test_busco_genome_single_fasta/work/63/1ae8fafa20dc10c72ca89f6e636457/busco_downloads/placement_files/tree_metadata.bacteria_odb10.2019-12-16.txt -a /tmp/pytest_workflow_k_fwdbvn/busco_test_busco_genome_single_fasta/work/63/1ae8fafa20dc10c72ca89f6e636457/busco_downloads/placement_files/supermatrix.aln.bacteria_odb10.2019-12-16.faa -f /tmp/pytest_workflow_k_fwdbvn/busco_test_busco_genome_single_fasta/work/63/1ae8fafa20dc10c72ca89f6e636457/test-auto-busco/genome.fna/auto_lineage/run_bacteria_odb10/placement_files/marker_genes.fasta -F 15 -m amino -p /tmp/pytest_workflow_k_fwdbvn/busco_test_busco_genome_single_fasta/work/63/1ae8fafa20dc10c72ca89f6e636457/test-auto-busco/genome.fna/auto_lineage/run_bacteria_odb10/sepp_tmp_files
2022-05-06 12:36:12 INFO:busco.busco_tools.Toolset      [sepp]  1 of 1 task(s) completed
2022-05-06 12:36:12 ERROR:busco.BuscoRunner     Placements failed. Try to rerun increasing the memory or select a lineage manually.

So out of memory, even though it ran with nf-core modules create-test-yml busco

FYI @mahesh-panchal , there is a similar bug report at https://gitlab.com/ezlab/busco/-/issues/555

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

new module: BUSCO
7 participants